Abstract

Global voting schemes based on the Hough transform (HT) have been widely used to robustly detect 
lines in images. However, since the votes do not take line connectivity into account, these methods do
not deal well with cluttered images. In contrast, the so-called local methods enforce connectivity but
lack robustness to deal with challenging situations that occur in many realistic scenarios, e.g., when line 
segments cross or when long segments are corrupted. In this paper, we address the critical limitations 
of the HT as a line segment extractor by incorporating connectivity in the voting process. This is done 
by only accounting for the contributions of edge points lying in increasingly larger neighborhoods and 
whose position and directional content agree with potential line segments. As a result, our method, 
which we call STRAIGHT (Segment exTRAction by connectivity-enforcInG HT), extracts the longest 
connected segments in each location of the image, thus also integrating into the HT voting process 
the usually separate step of individual segment extraction. The usage of the Hough space mapping and 
a corresponding hierarchical implementation make our approach computationally feasible. We present 
experiments that illustrate, with synthetic and real images, how STRAIGHT succeeds in extracting 
complete segments in several situations where current methods fail. 
I. Introduction

Line segments are fundamental low-level features for the analysis of many real-life images. In fact,
most man-made objects are made of flat surfaces, giving rise to images whose edge maps are composed of
line segments. Thus, these segments provide important information about the geometric content of the imaged
scene. This has been exploited, e.g., to localize vanishing points [1] or to match line segments across
distinct views [2]. Since more elaborate shapes are often described economically in terms of line
segments, their extraction is often a first step in many other problems, e.g., rectangle detection [3], the
inference of shape from lines [4], map-to-image registration [5], 3D reconstruction [6], or even image
compression [7].

In this paper, we propose a new method to extract line segments from images in an automatic way. 
Although this problem has been the focus of attention of several researchers in the past decades, current 
solutions do not cope with many challenging situations that arise in practice, as we detail in the sequel. 
For this reason, the robust detection of line segments remains an open frontier (see [8], [9] for examples
of recent advances). 

A. Overview of methods for line segment extraction 

The Hough transform (HT) [10], [11] is the most popular method to detect lines in images. Basically,
the HT extracts the lines that contain the largest numbers of edge points. The key to an efficient implementation
is the usage of the Hough space, a two-dimensional space parameterized in such a way that each point
in this space represents a line in the image. Each edge point in the image is then mapped to the region
of the Hough space that represents the pencil of all the image lines that go through that edge point. By
processing all edge points, the votes for each location in the Hough space are accumulated (this space
is also referred to as the accumulator array) and the locations with the largest numbers of votes correspond
to the parameterizations of the predominant lines in the image. Naturally, the success of the HT comes
from its global nature, since all points in a line contribute to its detection.
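The voting scheme just described can be sketched in a few lines. The snippet below is only an illustration, not the implementation used in the experiments: the function name `hough_accumulator`, the one-degree angular step, and the bound `rho_max` are assumptions made here, and the standard parameterization rho = x cos(theta) + y sin(theta) is used.

```python
import numpy as np

def hough_accumulator(edge_points, n_theta=180, rho_max=100):
    """Vote in a (rho, theta) accumulator using rho = x*cos(theta) + y*sin(theta)."""
    thetas = np.deg2rad(np.arange(n_theta) * 180.0 / n_theta)
    acc = np.zeros((2 * rho_max + 1, n_theta), dtype=int)
    cols = np.arange(n_theta)
    for x, y in edge_points:
        # One vote per orientation: the pencil of lines through (x, y).
        rho = np.rint(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        ok = np.abs(rho) <= rho_max
        acc[rho[ok] + rho_max, cols[ok]] += 1
    return acc, thetas

connected = [(x, 5) for x in range(20)]      # 20 adjacent points on y = 5
scattered = [(4 * x, 5) for x in range(20)]  # 20 collinear points separated by gaps
acc_c, _ = hough_accumulator(connected)
acc_s, _ = hough_accumulator(scattered)
votes_c = acc_c[5 + 100, 90]  # cell for rho = 5, theta = 90 degrees
votes_s = acc_s[5 + 100, 90]
```

Both point sets deposit the same number of votes in the cell of the line y = 5, since the accumulator only counts points and carries no information about whether they are adjacent.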

A strong limitation of the HT comes from the fact that the voting scheme does not take into account
that lines are alignments of connected points. In fact, since edge points voting for a particular line may
be disconnected from each other, there is no guarantee that parameters receiving large numbers of votes
correspond to long lines in the image (due to textures and/or noise, a peak in the accumulator array may
even correspond to the parameterization of a "false" line, i.e., a line that collects many separate points
but is not at all perceived as a line in the image). This problem is particularly critical when using the
HT to extract a line segment, i.e., a rectilinear point alignment with length that can be much smaller than
the image dimensions. First, short segments originate small peaks in the Hough space, which are
hard to identify [12]; second, a non-trivial extra step to determine the segment start and end points is
required, e.g., the analysis of the shape of the spread of votes in the accumulator array [13].

Fig. 1 illustrates the difficulty that may arise when using the HT to detect short line segments. We
contrast the HT of a synthetic image with long line segments (top left) with the one of a complex real 
image (bottom left). In the first case, since the number of lines is not very large and they are very long, the 
HT accumulator array exhibits the expected peaks (darker points visible in the top middle image), which 
are easily detected, and correspond to the correct lines (top right). In the second case, the large number 
of edge points in the real image, many of them forming very short segments or only corresponding to 
texture or noise, originates large numbers of votes for lines that do not correspond to connected segments. 
As a consequence, the accumulator array does not exhibit prominent peaks (see the smoothness of its 
grey-level representation in the bottom middle image) and its processing originates a poor result (bottom 
right) that contains spurious short segments and misses several others, including longer ones. 




Fig. 1. Top: success of the HT when detecting long lines in a clutter-free image. Bottom: limitation of the HT when detecting
short line segments in a complex image. In each case: on the left, the original image; in the middle, the HT accumulator array;
and on the right, the detected line segments.

The limitations of the HT have been pointed out by several authors, e.g., [12], [14], [15], [16], and
many efforts have been made to alleviate its problems. For example, the method in [11] uses the edge
direction to reduce the accumulation of spurious votes in the Hough space and [17] proposes a strategy
that is based on processing one line at a time: after detecting the line corresponding to the largest peak in
the accumulator array, the votes of the edge points that belong to this line are removed. Naturally, the
success of this approach hinges on the correctness of the first lines detected. As a consequence, in many
practical situations, when facing highly textured images and/or in the presence of noise, these approaches
are unable to provide accurate results. In fact, this happens because these improvements do not tackle
the fundamental limitation of the HT: the fact that it does not take into account the required connectivity
of the edge points forming a line segment. Other authors addressed storage and computational issues of
the HT by proposing a hierarchical scheme [18], multiple accumulator resolutions [19], or the usage of
a random sampling of the edge map (probabilistic HT) [20].

In spite of these limitations, the global nature of the HT is attractive, since line parameters estimated
from the complete data are naturally more accurate than those estimated locally. However, few papers
have approached the problem of developing global methods for line segment extraction, i.e., global
methods to extract, simultaneously, the line parameters and the segment extremes. A fruitful example is the method
in [21], which searches among all possible line segment candidates, using a Helmholtz principle for
validation. Naturally, the good results come at a high computational cost. Other approaches use the
widely known random sample consensus (RANSAC) [22]: basically, two edge points from the edge map,
randomly sampled, define a candidate line; then, the consensus of the line is evaluated by counting the
number of other edge points that fit that line segment, given an error tolerance; for segments with a high
consensus, the parameters are refined by using an iterative expectation-maximization (EM) method [23].
Since the success of RANSAC hinges on the usage of a very large number of samples, these approaches
are computationally too complex for many realistic applications.

For the reasons above, the majority of methods for line segment extraction rely on local decisions,
rather than on global ones, see [24], [25], [14], [26], [16] for examples. These local methods outperform
the HT by taking (local) connectivity into account, and are computationally simple, but lack robustness
to deal with challenging situations, e.g., when line segments cross. Furthermore, their local nature makes
long line segments particularly difficult to extract in many realistic scenarios, because, due to noise and
clutter, these segments are interrupted. The majority of local methods use three steps: first, obtaining a
region of connected edge points; then, roughly estimating the line segment direction; and finally, refining
and extending the segment by including new edge points that approximately fit the line. The first step
consists of chaining edge points [27]. Methods such as the one in [28] even skip the subsequent line
fitting and refinement steps by chaining connected edge points into curves and then cutting them into
line segments, using a straightness criterion. Texture, low-contrast regions, crossing segments, and noise
make it difficult to extract large connected regions belonging to a single segment. The second step
consists of fitting a line to the chain of edge points using, e.g., total least-squares (TLS) [26]. Naturally,
the reliability of the regression depends on the length of the underlying point chain. Some methods
bypass the chaining of edge points: [29] uses the so-called local HT (LHT) [30], roughly estimating the
segment direction from the peaks of local orientation histograms, computed at each edge point; [14], [31]
directly fit a line to all edge points inside a sliding window, which only provides reasonable estimates
for simple scenes, with very little clutter. The final step usually involves alternating between two stages
until convergence [26]: inclusion of new edge points that are close to the candidate line, according to a
distance measure; and re-estimation of the line segment parameters from the new set of edge points. As
is typical with this type of method, a poor initial model for the line segment may compromise
the final result. Furthermore, the process may terminate too early when attempting to extract a long line
segment, due to the common cluttered nature of the edge maps of real images. Two popular local methods
for line segment detection are [25] and the LSD (Line Segment Detector) of [16]. The method in [25]
coarsely quantizes the local orientation angles, chains adjacent pixels with identical orientation labels,
and fits a line segment to the grouped pixels. LSD [16] extends this idea by using continuous angles and
eliminates false line segment detections by using the Helmholtz principle of [21].

B. Proposed approach 

The key ingredient of our method is the incorporation of connectivity into the HT voting process. 
This is done by imposing that edge points only vote for lines according to which they are spatially 
connected to other points. As a consequence, the vast majority of spurious votes are eliminated and
peaks in the accumulator array become prominent and truly correspond to line segments of maximum
length. Simultaneously, our method, which we call STRAIGHT (Segment exTRAction by connectivity-
enforcInG Hough Transform), integrates into the voting process the usually separate step of determining
the segment extremes.

STRAIGHT starts by computing the prominent directions at each edge point, which will guide the 
search for the orientations of line segments. An image line segment is characterized by a rectilinear 
alignment of dark-to-light (or opposite) transitions. Our prominent direction detector computes, for each 
edge point, the set of directions according to which there is a predominance of those intensity transitions. 
This is accomplished by extending the LHT [30] to only take into account directionally coherent edge
points: in a first step, signed directional edge maps are computed; then, orientation histograms are built at
each edge point, by considering the neighboring edge points whose relative positions agree with the angle
of the directional edge map. The histogram accumulates the signed values of the intensity transitions,
thus the prominent directions at each edge point are detected by finding large magnitude entries in the
corresponding histogram.

After computing the local prominent directions, STRAIGHT extracts line segments using the knowledge 
that the edge points forming each of them must be connected. This is done by computing new LHT- 
like maps (which we will call length maps) for each edge point, this time taking into account all other 
edge points whose position and directional content agree with potential line segments. Position matters 
because only points that respect connectivity are considered; directional content matters because only 
edge points with prominent direction that agrees with the candidate segment are considered. In practice, 
for each prominent direction of each edge point, STRAIGHT progressively considers edge points further 
away until the connectivity criterion is violated. After exploring all candidate directions, the ones that
collected the most distant edge points correspond to the orientations of the longest connected line segments
going through the starting point. Note that allowing a set of prominent directions, rather than a single 
one, enables dealing with crossing segments. 

Our implementation of STRAIGHT incorporates the explicit mapping of uncertainty balls around the
edge points into the Hough space, increasing robustness and accuracy, and uses a hierarchical coarse-to-
fine strategy to explore candidate directions, leading to a computationally tractable algorithm. We present
illustrative results of experiments that use synthetic and real images to compare STRAIGHT with the
HT [11] and the state-of-the-art local method LSD [16]. A preliminary version of this work is in [32].



C. Paper organization 

The remainder of the paper is organized as follows. Section II describes the computation of
local prominent directions. Section III details the extraction of line segments by enforcing connectivity.
The hierarchical implementation is described in Section IV. Experimental results are reported in Section V
and Section VI concludes the paper.



II. Computing Local Prominent Directions 

In terms of image intensities, the existence of a line segment corresponds to the presence of a rectilinear 
alignment of dark-to-light (or opposite) transitions. Despite the multiple sources of inaccuracies in edge 
detection, such as noise and clutter, there should be a predominance of either positive or negative 
intensity variations (corresponding to light-to-dark and dark-to-light transitions, respectively) along the 
line segment. This predominance is what we exploit to compute the prominent directions at each edge
point, i.e., the set of possible orientations of line segments going through that point.

We capture the local directional content of an image $I$ by computing its derivatives, through the
convolution with four oriented kernels,

$$\nabla_\theta I = I * K_\theta, \qquad \theta \in \{0^\circ, 45^\circ, 90^\circ, 135^\circ\}.$$

Although kernels with a large support would smooth the noise, we use simple standard central difference
kernels, since they enable more precise edge localization and angular responses, by minimizing the
influence of surrounding pixels. Thus, we set



Kn 



In our case, robustness to noise when extracting line segments comes from the requirement of point 
connectivity, as detailed in the following section. 

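The directional edge maps $E_\theta$, $\theta \in \{0^\circ, 45^\circ, 90^\circ, 135^\circ\}$, are obtained by thresholding the derivatives
and retaining their sign, i.e.,

$$E_\theta(x,y) = \begin{cases} 1 & \text{if } \nabla_\theta I(x,y) > T, \\ -1 & \text{if } \nabla_\theta I(x,y) < -T, \\ 0 & \text{otherwise.} \end{cases}$$

We call $(x, y)$ an edge point when $|E_\theta(x, y)| = 1$ for at least one value of $\theta$.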
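A minimal sketch of the signed directional edge maps follows. It is not the authors' code: the pixel offsets in `OFFSETS`, which stand in for the four central difference kernels, and the zero response assigned to border pixels are assumptions made for illustration, as are all function names.

```python
import numpy as np

# Assumed (dx, dy) offsets of the central difference for each orientation.
OFFSETS = {0: (1, 0), 45: (1, 1), 90: (0, 1), 135: (-1, 1)}

def directional_derivative(img, dx, dy):
    """Central difference I(x+dx, y+dy) - I(x-dx, y-dy); border pixels are left at zero."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    out = np.zeros_like(img)
    out[1:-1, 1:-1] = (img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
                       - img[1 - dy:h - 1 - dy, 1 - dx:w - 1 - dx])
    return out

def directional_edge_maps(img, T):
    """Threshold each directional derivative and keep its sign (+1, -1 or 0)."""
    maps = {}
    for theta, (dx, dy) in OFFSETS.items():
        d = directional_derivative(img, dx, dy)
        maps[theta] = np.where(d > T, 1, np.where(d < -T, -1, 0))
    return maps

def edge_points(maps):
    """Boolean mask of pixels that are edge points for at least one orientation."""
    return np.any([m != 0 for m in maps.values()], axis=0)

# Vertical step edge: columns 0..2 dark, columns 3..5 bright.
img = np.zeros((5, 6))
img[:, 3:] = 10
maps = directional_edge_maps(img, T=5)
mask = edge_points(maps)
```

On this step image the 0° map responds along the step while the 90° map stays empty; negating the image flips the sign of the responses.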

We extend the LHT [30] to take into account the edge directional content, i.e., our method builds local
orientation histograms by using the neighboring edge points whose direction is coherent with their position.
To clarify, when building the histogram for the edge point $(x_0, y_0)$, we first compute the directions of
the segments passing through $(x_0, y_0)$ and each neighboring edge point $(x, y)$,

$$\theta_{(x_0,y_0)}(x,y) = \arctan\left(\frac{y - y_0}{x - x_0}\right) \in [0^\circ, 180^\circ).$$

Then, for each neighbor $(x, y)$, the histogram count uses only the directional edge map $E_\theta$ with the value
of $\theta$ that is closest to the direction perpendicular to $\theta_{(x_0,y_0)}(x,y)$:

$$\begin{cases}
E_{90}(x,y) & \text{if } \theta_{(x_0,y_0)}(x,y) \in [0, 22.5] \cup (157.5, 180), \\
E_{135}(x,y) & \text{if } \theta_{(x_0,y_0)}(x,y) \in (22.5, 67.5], \\
E_{0}(x,y) & \text{if } \theta_{(x_0,y_0)}(x,y) \in (67.5, 112.5], \\
E_{45}(x,y) & \text{if } \theta_{(x_0,y_0)}(x,y) \in (112.5, 157.5],
\end{cases} \qquad (2)$$

which just corresponds to the representation in Fig. 2. For example, if $(x_0, y_0) = (0, 0)$ and $(x, y) = (0, 3)$,
we have $\theta_{(0,0)}(0,3) = 90^\circ$, a vertical segment, and the directional edge point that contributes to the
histogram is $E_0(0, 3)$, since $K_0$ is the kernel that best responds to the horizontal transitions that define
vertical edges.
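The selection rule can be written as a small lookup. This is a sketch under stated assumptions: the function name `select_edge_map` is hypothetical, the angle is computed with `atan2` and folded into [0°, 180°), and the sector boundaries are taken to be evenly spaced at 22.5°, 67.5°, 112.5° and 157.5°.

```python
import math

def select_edge_map(p0, p):
    """Return (theta, k): the direction from p0 to p in [0, 180) degrees and the
    orientation k of the directional edge map E_k that contributes for this neighbor."""
    x0, y0 = p0
    x, y = p
    theta = math.degrees(math.atan2(y - y0, x - x0)) % 180.0
    if theta <= 22.5 or theta > 157.5:
        k = 90    # near-horizontal segment: use the map perpendicular to it
    elif theta <= 67.5:
        k = 135
    elif theta <= 112.5:
        k = 0     # near-vertical segment
    else:
        k = 45
    return theta, k

theta, k = select_edge_map((0, 0), (0, 3))
```

For the example in the text, the neighbor (0, 3) of (0, 0) gives a 90° direction and selects the 0° edge map.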




Fig. 2. Selection of directional edge map $E_\theta$ in terms of the relative position between edge points.



As usual in the LHT, the number $B$ of histogram bins is fixed ($B = 32$ is typical) and each edge point
contributes to the two bins whose centers approximate the angle $\theta_{(x_0,y_0)}(x,y)$ from above and from below (the
contributions are weighted according to the distance to the bin centers). The signs in the directional edge
maps are taken into account through positive or negative contributions to the histogram bins. This way, 
we filter out conflicting contributions that may occur due to noise, textures and image clutter. We use 
a relatively large neighborhood for the local histograms (a 7-pixel radius circular window) to minimize 
the influence of the spurious points in the edge maps. The prominent directions at each edge point are 
found by thresholding the magnitude of the corresponding local histogram (we use the threshold of 50% 
of the number of edge points inside the circular window for each direction, thus a direction is considered 
prominent when the majority of the edge points in that direction have the same intensity transition sign). 
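The histogram construction and thresholding can be sketched as follows. Several details are assumptions made for illustration only: bin centers are placed at (b + 0.5)·180/B degrees, the two-bin weights are linear in the distance to the bin centers, orientations wrap around at 180°, and the 50% threshold is applied to the weighted number of points contributing to each bin. The neighbor list is assumed to be already restricted to the circular window.

```python
import math

def local_histogram(neighbors, B=32):
    """neighbors: iterable of (theta_deg, sign), theta in [0, 180), sign in {-1, +1}.
    Returns (signed histogram, weighted count of contributing points per bin)."""
    width = 180.0 / B
    hist = [0.0] * B
    count = [0.0] * B
    for theta, sign in neighbors:
        pos = theta / width - 0.5      # position measured in bins, relative to bin centers
        lo = math.floor(pos)
        w_hi = pos - lo                # weight of the upper of the two nearest bins
        for b, w in ((lo % B, 1.0 - w_hi), ((lo + 1) % B, w_hi)):
            hist[b] += sign * w        # signed contribution
            count[b] += w
    return hist, count

def prominent_bins(hist, count, frac=0.5):
    """Bins whose signed count exceeds frac times the number of contributing points."""
    return [b for b in range(len(hist))
            if count[b] > 0 and abs(hist[b]) > frac * count[b]]

w = 180.0 / 32
# Four neighbors with a positive transition and one with a negative one, all at bin 3.
hist, count = local_histogram([(3.5 * w, +1)] * 4 + [(3.5 * w, -1)])
bins = prominent_bins(hist, count)
```

Conflicting signs cancel in the signed histogram, so a direction supported by a mix of positive and negative transitions falls below the threshold.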

To represent the range of possible directions going through each edge point $(x_0, y_0)$, we store the set
of prominent bin centers and the corresponding image gradients, i.e.,

$$\Theta(x_0, y_0) = \left\{ \left(\theta_1, \nabla_{\theta_{i_1}} I(x_0, y_0)\right), \left(\theta_2, \nabla_{\theta_{i_2}} I(x_0, y_0)\right), \ldots, \left(\theta_N, \nabla_{\theta_{i_N}} I(x_0, y_0)\right) \right\},$$

where $0 \le N \le B$ is the number of histogram bins whose count was above the threshold. If $N \ge 1$, for
$1 \le n \le N$, we denote by $\theta_n$ the central angle of the bin $n$ and by $\nabla_{\theta_{i_n}} I(x_0, y_0)$ the image gradient
with orientation $\theta_{i_n}$ that best matches $\theta_n$ according to (2). To account for the histogram discretization,
i.e., the nonzero width of the bins, we consider in the sequel as possible directions of line segments all
the orientations $\theta \in [\theta_n - \Delta_\theta, \theta_n + \Delta_\theta]$, with $\Delta_\theta = 180/B/2 = 90/B$.
III. Extracting Connected Line Segments



We now describe the core of STRAIGHT, i.e., the way we incorporate connectivity into the line segment
extraction. We start by making explicit the parameter search problem that underlies the extraction of each
of the segments, introducing the length map, which plays a role similar to that of the HT accumulator array. Then,
we describe how edge points are sequentially mapped into the length map by taking into account both 
the edge point connectivity and the uncertainty due to discretization. Finally, we describe the procedure 
to extract line segments from the length map. 

A. Line segment extraction as a parameter search problem — the length map 

Let $p_0 = (x_0, y_0)$ be an edge point and $\Theta(p_0)$ the set of prominent directions of possible line segments
passing through $p_0$, as introduced in the previous section. To accurately detect a line segment in the image
it is necessary to estimate the sub-pixel location of the line, which will be parameterized by its position
and orientation, in a similar way to the HT. The line position is specified in terms of its distance $\delta_\rho$ to the
edge point $p_0$. The line orientation is represented by $\delta_\theta$, the deviation with respect to
the prominent direction angle $\theta_n$ in $\Theta(p_0)$. Thus, as illustrated in Fig. 3, any point $p = (x, y)$ belonging
to the line $(\delta_\rho, \delta_\theta)$ obeys

$$\langle p, v_{\theta_n + \delta_\theta} \rangle = \langle p_0, v_{\theta_n + \delta_\theta} \rangle + \delta_\rho, \qquad (3)$$

where $\langle \cdot, \cdot \rangle$ is the standard inner product and $v_\theta = (\sin(\theta), -\cos(\theta))$ is a unit vector with direction
orthogonal to $\theta$. The candidate line segment is allowed to deviate at most a predefined $\Delta_\rho$ from $p_0$, i.e.,
$\delta_\rho \in [-\Delta_\rho, \Delta_\rho]$ (in our experiments, we use $\Delta_\rho = 1$), and $\Delta_\theta$ from $\theta_n$, i.e., $\delta_\theta \in [-\Delta_\theta, \Delta_\theta]$ ($\Delta_\theta$ was
defined in the previous section).

Fig. 3. Edge point $p_0$, prominent direction $\theta_n$, line segment specified by parameters $(\delta_\rho, \delta_\theta)$, and range limits $\Delta_\rho$ and $\Delta_\theta$.
When the estimate $(\delta_\rho, \delta_\theta)$ coincides with the true value of the parameters defining a line segment in
the image, there are several other edge points along the line $(\delta_\rho, \delta_\theta)$ for which the orientation $\theta = \theta_n + \delta_\theta$
is also prominent (with matching signs of the image gradient), according to the information collected
in $\{\Theta(\cdot)\}$. We call each of these edge points a candidate match. Besides, due to the connectivity of
the edge points forming a line, the gap between candidate matches must be smaller than a predefined
maximum distance threshold $d$. Naturally, the closer the estimated line is to the actual one, the more distant
the edge points of the true line segment that are captured. This is the key point of our approach, which formalizes
the extraction of a line segment as the search for the parameters $(\delta_\rho, \delta_\theta)$ that maximize the length of the
segment that can be extracted in the neighborhood of $p_0$.

To extract the maximum length segments that pass close to each edge point, we draw inspiration from
the LHT, where local accumulator arrays are used in contrast to the single accumulator array of the HT,
which cannot discriminate between distinct segments falling on the same line. In our case, we define
local 2D length maps $L : [-\Delta_\rho, \Delta_\rho] \times [-\Delta_\theta, \Delta_\theta] \mapsto \mathbb{N}_0$. For each edge point, $L(\delta_\rho, \delta_\theta)$ will contain the
integer length of the line segment $(\delta_\rho, \delta_\theta)$. This map can be regarded as an extension of the accumulator
array of an LHT in order to take line segment connectivity into account. In this scenario, the extraction of
a line segment passing close to $p_0$ consists of filling $L(\cdot, \cdot)$, obtaining the parameters $(\delta_\rho, \delta_\theta)$ for which
$L(\delta_\rho, \delta_\theta)$ is maximum, and computing its start and end points.

Filling $L(\cdot, \cdot)$ through the direct implementation of an exhaustive scan of the space $[-\Delta_\rho, \Delta_\rho] \times
[-\Delta_\theta, \Delta_\theta]$ would be computationally prohibitive. In fact, the discretization of this space must be very
fine to yield accurate results and, for each location, many edge points have to be processed (possibly,
hundreds or thousands). Besides, this approach would use redundant computations, since each edge point
would be visited several times because it may belong to several candidate line segments. For this reason,
as usually done to fill the HT accumulator array, we also adopt a pixel-centered approach, where the
edge points are used to fill the length map $L(\cdot, \cdot)$ in an efficient way.

B. Incorporating uncertainty — the update region 

We now show how each individual edge point is processed in our pixel-centered approach. Due to
the pixel grid discretization, we model each edge point by an uncertainty ball, rather than a pointwise
feature. We use the uncertainty ball radius $R = 1$, the maximum expected error in the location of each
pixel. Because the uncertainty ball has a non-infinitesimal area, there is a set of parameters $\{(\delta_\rho, \delta_\theta)\}$
whose corresponding line segments cross it. This is illustrated on the left side of Fig. 4, where the lines
formed by all orientations between $\theta_a$ and $\theta_b$ cross the uncertainty ball centered at $(x, y)$, for position
deviation $\delta_\rho = -a$ from $p_0$. We call the update region of the length map domain the set of positions and
orientations $\{(\delta_\rho, \delta_\theta)\}$ whose corresponding line segments cross the uncertainty ball centered at each
edge pixel. The update region for the pixel $p = (x, y)$ in the length map of $p_0$ is shown on the right
side of Fig. 4.




Fig. 4. The uncertainty ball of an edge pixel (left) and the corresponding update region in the length map domain (right). 



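To find the analytic expressions for the bounds of the update region, we use the line equation (3), now
seen as a condition for line segments rather than points. When computing the length map for the edge
point $p_0$, we see from (3) that any segment $(\delta_\rho, \delta_\theta)$ that crosses the uncertainty ball of pixel $p = (x, y)$
must verify

$$\langle p, v_{\theta_n + \delta_\theta} \rangle = \langle p_0, v_{\theta_n + \delta_\theta} \rangle + (\delta_\rho - r), \qquad (4)$$

where $-R \le r \le R$ is the distance between the center of the ball and the segment, along vector $v_{\theta_n + \delta_\theta}$
(depicted in Fig. 4). From (4), the line segment position $\delta_\rho$ is easily expressed in terms of its orientation
$\delta_\theta$ and $r$:

$$\delta_\rho = \langle p - p_0, v_{\theta_n + \delta_\theta} \rangle + r. \qquad (5)$$

The update region, which we denote by $U$, is thus the collection of intervals specified by all possible
orientations $\delta_\theta$ and corresponding positions $\delta_\rho$ given by (5), with $|r| \le R$:

$$U = \left\{ (\delta_\rho, \delta_\theta) : \delta_\theta \in [-\Delta_\theta, \Delta_\theta],\ \delta_\rho \in [\delta_\rho^-(\delta_\theta), \delta_\rho^+(\delta_\theta)] \cap [-\Delta_\rho, \Delta_\rho] \right\}, \qquad (6)$$

where the limits $\delta_\rho^-(\delta_\theta)$ and $\delta_\rho^+(\delta_\theta)$ are made explicit from (5) by expanding the inner product:

$$\delta_\rho^-(\delta_\theta) = (x - x_0) \sin(\theta_n + \delta_\theta) - (y - y_0) \cos(\theta_n + \delta_\theta) - R, \qquad (7)$$

$$\delta_\rho^+(\delta_\theta) = (x - x_0) \sin(\theta_n + \delta_\theta) - (y - y_0) \cos(\theta_n + \delta_\theta) + R. \qquad (8)$$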

If the update region of a given pixel is non-empty, we say that the pixel is within the search range. In
the illustration of Fig. 4, the search range for $p_0$ is the one limited by the lines labelled with $\theta_n - \Delta_\theta$
and $\theta_n + \Delta_\theta$.
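For a single orientation deviation, the bounds in (7) and (8) reduce to one interval of positions. The sketch below evaluates that interval and clips it to the length map limits; the function name `update_interval` and the default values R = 1 and position limit 1 are illustrative assumptions.

```python
import math

def update_interval(p0, p, theta_n_deg, d_theta_deg, R=1.0, rho_lim=1.0):
    """Interval of position deviations whose lines, at orientation
    theta_n + d_theta, cross the uncertainty ball of pixel p.
    Returns (lo, hi) clipped to [-rho_lim, rho_lim], or None if empty."""
    x0, y0 = p0
    x, y = p
    a = math.radians(theta_n_deg + d_theta_deg)
    # Signed offset of p from the line through p0 with orientation a.
    c = (x - x0) * math.sin(a) - (y - y0) * math.cos(a)
    lo = max(c - R, -rho_lim)
    hi = min(c + R, rho_lim)
    return (lo, hi) if lo <= hi else None

on_line = update_interval((0, 0), (10, 0), 0.0, 0.0)   # pixel on the candidate line
offset = update_interval((0, 0), (10, 1), 0.0, 0.0)    # pixel one row away
outside = update_interval((0, 0), (10, 3), 0.0, 0.0)   # pixel outside the search range
```

A pixel whose interval is empty for every sampled orientation deviation lies outside the search range and does not update the length map.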

As geometrically evident, the range of angles of lines that cross an uncertainty ball decreases as the
distance between $p_0$ and $p$ increases (an approximate expression for this range is $2 \arcsin(R / \|p - p_0\|)$,
obtained by noting that the segment from $p_0$ to $p$ can be approximated by the hypotenuse of a right-angled triangle of which
$R$ is the small cathetus). As a consequence, to enable the extraction of long line segments, i.e., containing
edge points $p$ far from $p_0$, the length map must be densely discretized to sample all relevant values of
$\delta_\theta$. We propose an efficient way to deal with this need through the hierarchical coarse-to-fine procedure
described in Section IV.
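To get a feeling for how quickly the angular range shrinks, the approximate expression can be evaluated directly. The helper name `angular_range_deg` and the default R = 1 are assumptions of this sketch.

```python
import math

def angular_range_deg(dist, R=1.0):
    """Approximate range of line orientations (in degrees) crossing an
    uncertainty ball of radius R located at distance dist."""
    return math.degrees(2.0 * math.asin(min(1.0, R / dist)))

near = angular_range_deg(10)    # about 11.5 degrees
far = angular_range_deg(100)    # about 1.15 degrees
```

A tenfold increase in distance narrows the range roughly tenfold, which is why distant edge points require a much finer sampling of the orientation deviation.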

We observe that expressions (7) and (8) are similar to the parameterizing equation of the HT [11],
$\rho = x \cos(\theta) + y \sin(\theta)$, with $p_0 = (x_0, y_0)$ as the origin of the coordinate system and $\theta$ shifted by
90°. Thus, the boundaries of the update region resemble the sinusoidal shape of the bundles of votes of
each edge point in the HT accumulator array (see Fig. 4 where, in fact, only a segment of that shape is
seen, due to the length map limits). Regarding the resolution of the accumulator array, when
using the HT, the contradictory requirements of accuracy (high resolution) and coping with discretization
error (low resolution, so that votes of the same line fall within the same bin) make it difficult, if not
impossible, to achieve a good compromise. Strategies that uniformly blur the accumulation array (e.g.,
by using multiple resolutions [18], [19], or kernels of various sizes [33]) do not change the scenario,
since they still neglect the distinct influence of the discretization error of edge points located at different 
positions. In contrast, in our case, the resolution of the length map can be chosen arbitrarily large,
since we model the actual discretization error of each individual edge point by using the corresponding
(position-dependent) update region, as described above.

C. Sequential mapping of edge points to the length map 

After describing how each pixel maps to a corresponding update region in the length map, we now
show how to fill this map in a sequential way by processing all image edge points. As before, let us
consider the case of the edge point $p_0$, with prominent direction $\theta_n$. When filling the corresponding
length map $L$, we consider the image divided in two half-planes by the line orthogonal to $\theta_n$ passing
through $p_0$. In each half-plane, starting from $p_0$, we circularly scan the image, with progressively larger
radius, mapping to the corresponding half-plane length map the candidate matches that fall within the
search range and do not violate the connectivity requirement.

Due to its graceful adaptation to the discrete pixel grid, we use the $\ell_\infty$ (chessboard) distance to
define the equidistant curves as the sets of pixels $p$ located at a fixed distance $e$ from $p_0$ (i.e., such that
$\|p - p_0\|_\infty = e$). Fig. 5 illustrates the scenario, with the central edge point $p_0$, the equidistant curves,
labeled by the distance values, the search range and the half-planes.

Fig. 5. Central edge point $p_0$, search range, and equidistant curves, each labeled by its integer distance to $p_0$.



For each half-plane, we scan each equidistant curve, starting with the one closest to $p_0$ (i.e., the curve
with label $e = 1$ in Fig. 5), looking for candidate matches. Fig. 6 illustrates the scanning pattern for
each equidistant curve. It starts in the center of the search range, i.e., the pixel labeled with 0 in Fig. 6,
and processes each pixel within the curve until reaching the limit of the search range (i.e., the pixels
labeled with positive values in Fig. 6, up to label 4, shown in red). Then, the pixels in the other direction
are scanned, until the complete equidistant curve within the search range has been processed (i.e., the pixels
labeled with negative values in Fig. 6, down to label −6, shown in red). Then, the following equidistant
curve is processed.

To account for the distance used to define the equidistant curves, the discrete pixel $[p]$ at the center of the search
range for the equidistant curve $e$ is given by

$$[p] = \mathrm{round}\left( p_0 \pm e \, \frac{u_{\theta_n}}{\|u_{\theta_n}\|_\infty} \right),$$

where $u_\theta = (\cos(\theta), \sin(\theta))$ is a unit vector with angle $\theta$, and the sign $\pm$ depends on the half-plane
being considered.
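The following sketch enumerates an equidistant curve and locates the pixel at the center of the search range. It assumes that the curves are the square rings of pixels at maximum-norm distance e from the central point, and that the direction vector is rescaled by its maximum norm so that the rounded point falls on the ring; both function names are hypothetical.

```python
import math

def equidistant_curve(p0, e):
    """All pixels whose maximum-norm distance to p0 equals e (a square ring)."""
    x0, y0 = p0
    return [(x0 + dx, y0 + dy)
            for dx in range(-e, e + 1)
            for dy in range(-e, e + 1)
            if max(abs(dx), abs(dy)) == e]

def center_pixel(p0, theta_deg, e, sign=+1):
    """Pixel of ring e lying along direction theta from p0 (sign selects the half-plane)."""
    ux = math.cos(math.radians(theta_deg))
    uy = math.sin(math.radians(theta_deg))
    s = max(abs(ux), abs(uy))  # maximum norm of the unit direction vector
    return (round(p0[0] + sign * e * ux / s), round(p0[1] + sign * e * uy / s))

ring = equidistant_curve((0, 0), 5)
c = center_pixel((0, 0), 30.0, 5)
```

Scanning would then start at `c` and proceed along the ring in both directions until the limits of the search range are reached.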

When a candidate match is found in the equidistant curve $e$, its update region is computed, according 
to (6), (7), and (8), and the corresponding entries of the length map are updated. This update consists 
simply of setting those entries to the value of the equidistant curve number, $e$. This indicates that there 
are valid line segments of (at least) size $e$ with the parameters corresponding to those entries. To capture 
the connected nature of the line segments, we first prune the update region, eliminating the locations 
where the difference between the equidistance value $e$ and the current value of the length map, $L$, is 
larger than the maximum distance threshold $d$, i.e., 

$$U \leftarrow U \setminus \left\{ (\delta_\rho, \delta_\theta) : e - L(\delta_\rho, \delta_\theta) > d,\ \delta_\rho \in [-\Delta_\rho, \Delta_\rho],\ \delta_\theta \in [-\Delta_\theta, \Delta_\theta] \right\}. \qquad (9)$$ 

Then, we update the length map according to 

$$L \leftarrow e\,U + L \odot \bar{U},$$ 

where $U$ is seen as a binary mask, $\bar{U}$ is its complement, and $\odot$ denotes the Hadamard, or 
elementwise, product. When the difference between $e$ and all the values in the length map, $L(\cdot,\cdot)$, is 
larger than $d$, i.e., when $U = \emptyset$, there are no updatable entries in the length map and the scanning 
stops for the corresponding half-plane. In the vast majority of our experiments, we used $d = 2$ pixels. 
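
As a minimal sketch of the pruning step (9) followed by the length map update, assuming the length map and the update region are stored as NumPy arrays of the same shape, one may write the following. The function name `update_length_map` is hypothetical.

```python
import numpy as np


def update_length_map(L, U, e, d=2):
    """Prune the update region and write the curve number e into the map.

    L : 2-D integer array, current half-plane length map.
    U : 2-D boolean array, update region of the current candidate match.
    e : number of the equidistant curve being scanned.
    d : maximum distance threshold.
    Returns the updated map and the pruned mask.
    """
    U = U & ~((e - L) > d)    # connectivity: drop entries lagging e by more than d
    L = np.where(U, e, L)     # e inside the mask, previous value elsewhere
    return L, U


L = np.array([[0, 3], [4, 4]])
U = np.array([[True, True], [True, False]])
L_new, U_new = update_length_map(L, U, e=5, d=2)
```

In this small example the top-left entry is pruned, since it lags the current curve by 5 > d, whereas the entries holding 3 and 4 inside the mask are raised to 5. An empty pruned mask is the stopping condition for the half-plane.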

Alg. 1 summarizes the procedure just described to compute each half-plane length map. The final length 
map for each edge point $p_0$ and prominent direction $\theta_n$ is obtained by adding the two half-plane 
length maps. 

Algorithm 1 Filling one half-plane length map $L$ for edge point $p_0$ and prominent direction $\theta_n$. 
 1: input: $p_0 = (x_0, y_0)$, $\theta_n$, prominent directions $\{\Theta(\cdot)\}$, maximum distance $d$, uncertainty radius $R$ 
 2: $L(\cdot,\cdot) = 0$, $e = 1$ 
 3: repeat 
 4:   for $[p] = (x, y)$ in the equidistant curve $e$ (as illustrated in Fig. 6) do 
 5:     if $[p]$ is a candidate match (according to $\Theta(p)$) then 
 6:       $U = \emptyset$ 
 7:       for $\delta_\theta \in [-\Delta_\theta, \Delta_\theta]$ do 
 8:         $\delta_- = (x - x_0) \sin(\theta_n + \delta_\theta) - (y - y_0) \cos(\theta_n + \delta_\theta) - R$ 
 9:         $\delta_+ = (x - x_0) \sin(\theta_n + \delta_\theta) - (y - y_0) \cos(\theta_n + \delta_\theta) + R$ 
10:         $U \leftarrow U \cup \{(\delta_\rho, \delta_\theta) : \delta_\rho \in [\delta_-, \delta_+] \cap [-\Delta_\rho, \Delta_\rho]\}$ 
11:       end for 
12:       $U \leftarrow U \setminus \{e - L(\cdot,\cdot) > d\}$ (requirement of line segment connectivity) 
13:       $L \leftarrow e\,U + L \odot \bar{U}$ 
14:     end if 
15:   end for 
16:   $e \leftarrow e + 1$ 
17: until $e - \max\{L(\cdot,\cdot)\} > d$ 
18: output: $L$ 
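
To make lines 7 to 10 of Alg. 1 concrete, the following sketch builds the update region of one candidate match as a boolean mask over a discretized $(\delta_\rho, \delta_\theta)$ grid. The function name `update_region` and the explicit axis arrays are hypothetical. We assume that `rho_axis` already spans $[-\Delta_\rho, \Delta_\rho]$, so the intersection in line 10 is implicit, and that angles are given in radians.

```python
import numpy as np


def update_region(x, y, x0, y0, theta_n, R, rho_axis, theta_axis):
    """Boolean mask of the (rho offset, angle offset) cells hit by pixel (x, y).

    Rows follow rho_axis, columns follow theta_axis. A cell is set when the
    line with those parameters passes within R of the pixel.
    """
    ang = theta_n + theta_axis                            # one angle per column
    center = (x - x0) * np.sin(ang) - (y - y0) * np.cos(ang)
    lo, hi = center - R, center + R                       # delta_- and delta_+
    rho = rho_axis[:, None]
    return (rho >= lo[None, :]) & (rho <= hi[None, :])


rho_axis = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
theta_axis = np.array([0.0, 0.1])
mask = update_region(10, 0, 0, 0, 0.0, 1.0, rho_axis, theta_axis)
```

The second column of `mask` is shifted with respect to the first one, which reflects how the admissible position offsets of a distant pixel move with the candidate angle.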



D. Extracting line segments 

As when detecting lines from the peaks of the HT accumulator array, we detect line segments passing 
through $p_0$ with an orientation close to the prominent direction $\theta_n$ by simply collecting 
position-orientation pairs lying in the range $(\delta_\rho, \delta_\theta) \in [-\Delta_\rho, \Delta_\rho] \times [-\Delta_\theta, \Delta_\theta]$ that 
correspond to peaks in the corresponding length map $L(\cdot,\cdot)$. 

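Whenever a line segment is detected, with position-orientation parameters $(\delta_\rho, \delta_\theta)$, we also obtain in 
a straightforward way the coordinates of its extremes: 

$$[p_\pm] = \mathrm{round}\left( p_0 \pm L_\pm \frac{v_{\theta_n+\delta_\theta}}{\|v_{\theta_n+\delta_\theta}\|_\infty} + \delta_\rho\, v^\perp_{\theta_n+\delta_\theta} \right), \qquad (10)$$ 

where the subscript $\pm$ differentiates both extremes and $L_\pm$ denotes the maximum values of the length 
map of the corresponding half-planes. 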
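
The computation of the extremes can be sketched as below, under several assumptions that are ours and not necessarily those of the original implementation: the direction vector is normalized by its infinity norm, the two extremes lie on opposite sides of $p_0$, and the position offset is applied along $(\sin\theta, -\cos\theta)$, which is the direction along which the offsets are measured in Alg. 1. The function name `segment_extremes` is hypothetical.

```python
import numpy as np


def segment_extremes(p0, theta, d_rho, len_plus, len_minus):
    """Integer end points (p_plus, p_minus) of a detected segment.

    p0        : central edge point (x0, y0).
    theta     : segment orientation theta_n + delta_theta, in radians.
    d_rho     : position offset delta_rho of the segment.
    len_plus  : peak value of the length map of the positive half-plane.
    len_minus : peak value of the length map of the negative half-plane.
    """
    p0 = np.asarray(p0, dtype=float)
    v = np.array([np.cos(theta), np.sin(theta)])
    v_perp = np.array([np.sin(theta), -np.cos(theta)])   # assumed offset direction
    step = v / np.max(np.abs(v))                         # assumed infinity-norm scaling
    offset = d_rho * v_perp
    p_plus = np.rint(p0 + len_plus * step + offset).astype(int)
    p_minus = np.rint(p0 - len_minus * step + offset).astype(int)
    return tuple(p_plus.tolist()), tuple(p_minus.tolist())


ends = segment_extremes((10, 10), 0.0, 0.0, 5, 3)
```

For a horizontal segment through (10, 10) with lengths 5 and 3 in the two half-planes, the sketch returns the end points (15, 10) and (7, 10).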

To prevent multiple detections of a single line segment, each time a segment is detected for a central 
point $p_0$ and prominent direction $\theta_n$, we remove that prominent direction from all candidate matches $p$ 
in the line segment. Multiple crossing segments are naturally extracted by collecting position-orientation 
pairs in all prominent directions $\{\theta_n, 1 \le n \le N\}$. To enable the detection of crossing segments with 
very close direction angles, a fine discretization of the angle histogram is required. Alternatively, we can 
use variable bin sizes for each prominent direction, in which case only (a short interval around) 
the angle of an extracted line would be removed from all candidate matches $p$ in the line segment. In 
the latter scenario, the corresponding length map may contain more than one peak, and each one is dealt 
with independently. 

A final remark concerns preventing a few edge points of lines with position-orientation parameters 
outside the range $[-\Delta_\rho, \Delta_\rho] \times [-\Delta_\theta, \Delta_\theta]$ from voting for spurious lines inside that range. In 
fact, this would happen whenever the uncertainty balls of those edge points intersect that range, as 
illustrated in Fig. 7. We explicitly detect these cases and ignore them by using an orientation limit 
slightly larger than $\Delta_\theta$ (say, an increase of 2°) and only considering as detected segments those with 
estimated orientation within the original limits, i.e., $\delta_\theta \in [-\Delta_\theta, \Delta_\theta]$. 

Fig. 7. Edge points in a line outside the position-orientation range $[-\Delta_\rho, \Delta_\rho] \times [-\Delta_\theta, \Delta_\theta]$ and a 
spurious line segment, shown in green, that could be erroneously detected inside that range. 

Alg. 2 summarizes the procedure to extract line segments, where the usage of non-maxima suppression 
in line 3 is not detailed, since it is similar to the standard procedure for extracting peaks from the 
accumulator array of the HT. The only difference is that, since STRAIGHT has detected all the candidate 
matches for each line segment, the parameters $(\delta_\rho, \delta_\theta)$ are more accurately estimated by fitting a line 
to the coordinates of the candidate matches with weights proportional to the magnitude of the image 
gradients. Naturally, other fitting criteria can easily be adopted in STRAIGHT. 
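
One possible fitting criterion is sketched below. The text does not specify the criterion beyond the gradient-magnitude weights, so we assume here a weighted total least squares fit, i.e., minimizing the weighted sum of squared perpendicular distances. The function name `weighted_line_fit` is hypothetical.

```python
import numpy as np


def weighted_line_fit(points, weights):
    """Fit a line to 2-D points by weighted total least squares.

    points  : sequence of (x, y) coordinates of the candidate matches.
    weights : non-negative weights, e.g., image gradient magnitudes.
    Returns a point on the line (the weighted centroid) and a unit
    direction vector, defined up to sign.
    """
    pts = np.asarray(points, dtype=float)
    w = np.asarray(weights, dtype=float)
    centroid = (w[:, None] * pts).sum(axis=0) / w.sum()
    q = pts - centroid
    scatter = (w[:, None] * q).T @ q          # 2 x 2 weighted scatter matrix
    _, eigvecs = np.linalg.eigh(scatter)      # eigenvalues in ascending order
    direction = eigvecs[:, -1]                # axis of largest spread
    return centroid, direction


centroid, direction = weighted_line_fit([(0, 0), (1, 1), (2, 2), (3, 3)], [1, 2, 3, 4])
```

The refined orientation and position offsets then follow from the angle of `direction` and from the signed distance of $p_0$ to the fitted line.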
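
Algorithm 2 Extracting line segments passing through $p_0$ with orientation close to $\theta_n$. 
 1: input: $p_0$, $\theta_n$, half-plane length maps $L_+(\cdot,\cdot)$ and $L_-(\cdot,\cdot)$, prominent directions $\{\Theta(\cdot)\}$, angle range limit $\Delta_\theta$, uncertainty ball radius $R$ 
 2: repeat 
 3:   $(\delta_\rho, \delta_\theta) = \arg\max\, [L_+(\cdot,\cdot) + L_-(\cdot,\cdot)]$ (find and remove peak using non-maxima suppression) 
 4:   if $|\delta_\theta| \le \Delta_\theta$ then 
 5:     $[p_+] = \mathrm{round}\left( p_0 + L_+(\delta_\rho, \delta_\theta)\, v_{\theta_n+\delta_\theta} / \|v_{\theta_n+\delta_\theta}\|_\infty + \delta_\rho\, v^\perp_{\theta_n+\delta_\theta} \right)$ 
 6:     $[p_-] = \mathrm{round}\left( p_0 - L_-(\delta_\rho, \delta_\theta)\, v_{\theta_n+\delta_\theta} / \|v_{\theta_n+\delta_\theta}\|_\infty + \delta_\rho\, v^\perp_{\theta_n+\delta_\theta} \right)$ 
 7:     for the candidate matches $p$ whose distance to $[p_-][p_+]$ is smaller than (or equal to) $R$, remove from $\Theta(p)$ the entry corresponding to $\theta_n$ 
 8:   end if 
 9: until there are no prominent peaks in $[L_+(\cdot,\cdot) + L_-(\cdot,\cdot)]$ 
10: output: extremes of the extracted line segments, $\{[p_-], [p_+]\}$, and updated $\{\Theta(\cdot)\}$ 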
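
The peak extraction loop of Alg. 2, including the orientation check of line 4, can be sketched as follows. This is a simplified greedy non-maxima suppression, assumed here for illustration. The function name `extract_segments`, the threshold `min_peak` that defines a prominent peak, and the suppression `radius` are hypothetical parameters, and the length maps are assumed to be computed over a slightly enlarged angle range described by `theta_axis`.

```python
import numpy as np


def extract_segments(L_plus, L_minus, theta_axis, delta_theta, min_peak, radius=1):
    """Greedy peak extraction from the sum of the two half-plane length maps.

    Rows index position offsets and columns index angle offsets. Returns a
    list of (row, column, length_plus, length_minus) for the accepted peaks.
    """
    score = (L_plus + L_minus).astype(float)
    found = []
    while True:
        i, j = np.unravel_index(np.argmax(score), score.shape)
        if score[i, j] < min_peak:            # no prominent peak left
            break
        if abs(theta_axis[j]) <= delta_theta:  # keep only in-range orientations
            found.append((int(i), int(j), int(L_plus[i, j]), int(L_minus[i, j])))
        # Remove the peak and its neighborhood, whether accepted or not.
        score[max(0, i - radius):i + radius + 1,
              max(0, j - radius):j + radius + 1] = -np.inf
    return found


theta_axis = np.array([-12.0, -10.0, 0.0, 10.0, 12.0])   # degrees, enlarged by 2
L_plus = np.zeros((3, 5), dtype=int)
L_minus = np.zeros((3, 5), dtype=int)
L_plus[1, 2], L_minus[1, 2] = 6, 4     # genuine segment at angle offset 0
L_plus[0, 4], L_minus[0, 4] = 8, 7     # spurious peak in the enlarged margin
segments = extract_segments(L_plus, L_minus, theta_axis, delta_theta=10.0, min_peak=5)
```

In this example the stronger peak lies in the enlarged angle margin and is discarded, so only the peak at zero angle offset is reported.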





IV. Hierarchical Implementation 

Although the computational complexity of the pixel-centered approach described in the previous section 
is much lower than that of an intensive approach, there are still some issues that need addressing. 
Because the discretization of $L(\cdot,\cdot)$ must be fine, every time a new candidate match is found, a very 
large number of positions in the length map need to be updated, which is a time-consuming operation. 
Furthermore, the number of pixels in the equidistant curves that fall within the search range and need 
to be scanned increases considerably with the size of the detected segment. Simultaneously, as the 
scanning of edge pixels proceeds and the corresponding updates are incorporated in the length map, the 
region of the map that remains updatable progressively becomes smaller. This occurs because more 
distant pixels correspond to smaller angle ranges, as explained in the previous section, and fewer 
$(\delta_\rho, \delta_\theta)$ positions still correspond to quasi-connected line segments. Since this narrowing of the 
updatable area was not taken into account in the previous section, most pixel checks are unnecessary 
and further computational cost optimizations are possible. This motivates the hierarchical 
implementation of STRAIGHT, as outlined in this section. 

Our hierarchical approach progressively zooms in on the updatable regions, thus increasing their 
discretization density. The process starts with a length map that spans the initial wide location and angle 
ranges, as described in the previous section. Every time a set of equidistant curves is processed, the 
rectangular bounding box containing the updatable region (illustrated in the left image of Fig. 8) is 
upscaled, so that it takes up the complete length map (right image of Fig. 8). This way, although the length 
map has a constant size, it progressively addresses narrower location and angle ranges around $(\delta_\rho, \delta_\theta)$, 
effectively increasing the resolution of the estimates. Since the resolution can increase indefinitely, a 
coarse discretization of $L(\cdot,\cdot)$ (we use an array of size 21 × 21) becomes sufficient to obtain long line 
segments and fewer pixels are tested, thus resulting in computationally efficient line segment extractions. 
Since, as described in the previous section, a length map may contain multiple disconnected updatable 
regions, corresponding to different line segments passing through $p_0$, this process also divides the length 
map into multiple ones, each focused on a particular updatable region, and each estimation proceeds 
independently. 




Fig. 8. Illustration of the hierarchical implementation of STRAIGHT. Left: length map, with the bounding box of the updatable 
region. Right: the same region, after upscaling. 

To implement the length map upscaling in the hierarchical STRAIGHT, we use Nearest Neighbor 
interpolation. 
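
A single zoom step can be sketched as follows, assuming a length map with uniformly sampled axes and a non-empty updatable region. The function name `zoom_on_updatable` and the way the parameter ranges are passed are hypothetical. Nearest neighbor interpolation is implemented by rounding the resampling positions to the closest original cell.

```python
import numpy as np


def zoom_on_updatable(L, U, rho_range, theta_range):
    """Stretch the bounding box of the updatable region over the whole map.

    L           : 2-D length map (rows: position offsets, columns: angle offsets).
    U           : boolean mask of the updatable region (assumed non-empty).
    rho_range   : (min, max) position offsets currently spanned by the rows.
    theta_range : (min, max) angle offsets currently spanned by the columns.
    Returns the zoomed map, with the same shape as L, and the new ranges.
    """
    rows = np.flatnonzero(U.any(axis=1))
    cols = np.flatnonzero(U.any(axis=0))
    r0, r1, c0, c1 = rows[0], rows[-1], cols[0], cols[-1]
    n_r, n_c = L.shape
    # Nearest neighbor resampling of the bounding box to the full map size.
    ri = np.rint(np.linspace(r0, r1, n_r)).astype(int)
    ci = np.rint(np.linspace(c0, c1, n_c)).astype(int)
    zoomed = L[np.ix_(ri, ci)]
    rho_axis = np.linspace(rho_range[0], rho_range[1], n_r)
    theta_axis = np.linspace(theta_range[0], theta_range[1], n_c)
    new_rho = (float(rho_axis[r0]), float(rho_axis[r1]))
    new_theta = (float(theta_axis[c0]), float(theta_axis[c1]))
    return zoomed, new_rho, new_theta


L = np.arange(25).reshape(5, 5)
U = np.zeros((5, 5), dtype=bool)
U[1:4, 2:4] = True
zoomed, new_rho, new_theta = zoom_on_updatable(L, U, (-2.0, 2.0), (-10.0, 10.0))
```

The array keeps its size while the ranges it covers shrink, which is what allows a coarse map such as 21 × 21 to reach an arbitrarily fine resolution after repeated zoom steps.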

V. Experiments 

In the absence of an established database for benchmarking the performance of methods for line 
segment extraction, we single out demonstrative results of STRAIGHT, contrasting them with the ones 
obtained with the HT [11] and the state-of-the-art LSD [16] (the superiority of LSD when compared to 
several other methods is thoroughly demonstrated in [16]). We first describe experiments with synthetic 
images to illustrate extreme cases that help to characterize the general behavior of STRAIGHT. Then, 
we present results obtained with several real world images that demonstrate its performance in practice. 

A. Synthetic images 

We start by illustrating that STRAIGHT succeeds in cases tailored to the HT, i.e., when processing 
images for which the HT exhibits clear superiority with respect to local methods. We use again the 
synthetic image used in Section I, reproduced on the left of Fig. 9. This is a binary image used 
in a review of several HT-based line segment extraction methods [20]. As we have anticipated in 
Section I and in accordance with the conclusions of [20], the HT succeeds in correctly extracting the 
lines from this image. In fact, although the multiple crossings make this image visually complex, the HT 
accumulator array exhibits the desired prominent peaks (see Fig. 1), capturing the fact that the lines are 
long and not very numerous. In the third image of Fig. 9, we show the results of LSD. That a 
pair of twin segments is extracted for each one in the original image is due to the fact that LSD treats the 
binary image as any other, i.e., as a grey-level one, and both light-to-dark and dark-to-light transitions 
are detected. However, what is more important is that the local nature of LSD limits its performance, 
particularly in resolving the line intersections, causing it to fail to extract several complete segments 
that cross each other. The rightmost image displays the result of STRAIGHT, showing that it successfully 
extracts the majority of the line segments, regardless of the intersections (to make the comparison fair, 
we also processed the image as a grey-level one, originating the double-detection effect). 




Fig. 9. Clutterless image with prominent lines. From left to right: original binary image, result of the HT [11], LSD [16], and 
the proposed method, STRAIGHT. 



We now illustrate the behavior of the algorithms when dealing with the other extreme of the spectrum, 
i.e., with images whose line segments are characterized by being frontiers of differently textured regions, 
rather than abrupt changes in a very smooth intensity level. We use the synthetic images on the left of 
Fig. 10, which were generated by adding noise to a piecewise constant map. The top image simulates 
a scenario where a textureless object occludes a textured one (e.g., a wall in front of a tree) and 
the bottom one simulates two textured objects. In both cases, although the line segments that separate 
regions are perceptually evident, they are not trivially mapped to the edges of the images. In fact, since the 
textures produce a large number of spurious edge points and the perceptual segments are not guaranteed 
to produce the corresponding edges, the HT [11] fails to extract them and originates a huge number of false 
detections, as shown in Fig. 10. In contrast, LSD [16] succeeds in interpreting the textures as not forming 
line segments but only captures parts of the real segments for the top image and almost none for the 
bottom one. This is due to the local nature of LSD, which makes it sensitive to the missing edge points 
in the line segments. The rightmost images of Fig. 10 display the results of STRAIGHT, showing that 
it succeeds in extracting the perceptually relevant lines as forming, in both cases, four complete line 
segments (the few short segments correspond to accidental connected alignments in the random texture). 


Fig. 10. Textured images. From left to right: original image, result of the HT [11], LSD [16], and STRAIGHT. 



B. Real images 

We start by showing the results obtained with the image used in Section I to clarify the limitations of 
the HT (Fig. 1). This image is challenging due to its dense packing of line segments of multiple lengths. 
In the top right image of Fig. 11, we display the results of LSD [16], showing that a subset of the line 
segments are in fact detected. However, a closer look reveals that those are only the line segments that 
do not cross other structures and also that several longer segments are detected as fragmented ones. The 
results of STRAIGHT are in the two bottom images of Fig. 11. We see that our method succeeds in 
extracting the vast majority of the line segments in the image (exceptions are those which exhibit very 
low contrast). The fact that the extracted line segments are complete is particularly evident in the bottom 
right image, which displays only the line segments that have length greater than 50 pixels. 



Fig. 11. Top left: image. Top right: LSD [16]. Bottom left: STRAIGHT. Bottom right: STRAIGHT (longer line segments). 



To illustrate how noise affects the extraction of line segments in real images, we report the results 
obtained with noisy versions of the same image. Fig. 12 summarizes the results for two levels of zero-mean 
white Gaussian noise. We see that, with the increase of the noise level, LSD [16] produces more segment 
fragmentations and progressively fails to detect some line segments. The performance of STRAIGHT 
declines less steeply, as expected from its global nature. 

Fig. 13 presents another illustrative case. It was obtained by processing an image containing a complex 
scene occluded by a net composed of very long line segments that cross multiple times. The result of 
LSD [16] shows the net broken into short line segments (several sections of the net are not even extracted). 
On the other hand, our method was able to obtain almost all the complete line segments of the net, even 
in locations where the background is complex (exceptions are where the net has a very low contrast with 
respect to the background). The line segments extracted by our method that have length greater than 50 
pixels, displayed in the bottom right image of Fig. 13, make this particularly evident. 



Fig. 13. Top left: image. Top right: LSD [16]. Bottom left: STRAIGHT. Bottom right: STRAIGHT (longer line segments). 

Finally, Fig. 14 presents results of using STRAIGHT with real images of various kinds. As desired, 
the vast majority of long line segments are extracted without artificial fragmentation, despite the 
multiple segment crossings. Also note that, although some of these images have edges that form curves, 
STRAIGHT succeeds in approximating these sections in a piecewise linear way, i.e., by a sequence of 
rectilinear line segments. 

Fig. 14. Results of STRAIGHT for several kinds of real images. 

VI. Conclusion 

We have presented a new method for line segment extraction, which we call STRAIGHT (Segment 
exTRAction by connectivity-enforcInG HT). Our method inherits the global accuracy of 
the HT and overcomes its limitations, particularly those that arise from not taking into account that line 
segments are connected sets of edge points. Our experiments show that STRAIGHT outperforms current 
methods for line segment extraction in challenging situations, e.g., when dealing with complex images 
containing several crossing segments. 

We end by pointing out that our approach may pave the way to other improvements in HT-like image 
edge analysis. In fact, as we saw, the HT leads to erroneous votes, which are eliminated by taking 
point connectivity into account. Thus, the detection of non-rectilinear shapes, e.g., circles, in challenging 
scenarios, may also benefit from a similar treatment. 